Towards Spam Detection at Ping Servers
Abstract
Spam blogs, or splogs, feature plagiarized or auto-generated content. They create link farms to promote affiliates and are motivated by the profitability of hosting ads. Splogs infiltrate the blogosphere at ping servers, the systems that aggregate blog update pings. Over the past year, our work has focused on detecting and eliminating splogs. As the techniques used by spammers have evolved, we have learned that splog signatures are tied to the tools that create them, that splogs are becoming a problem across languages, and that they require much quicker assessment. Though we continue to address these specific challenges, in this work we discuss our larger goal of developing a scalable meta-ping filter that detects and eliminates update pings from splogs. Such a filter would considerably reduce computational requirements and manual effort at downstream services (search engines) and would involve the community in detecting spam blogs.
Similar resources
Towards Improving E-mail Content Classification for Spam Control: Architecture, Abstraction, and Strategies
This dissertation discusses techniques to improve the effectiveness and the efficiency of spam control. Specifically, layer-3 e-mail content classification is proposed to allow e-mail pre-classification (for fast spam detection at receiving e-mail servers) and to allow distributed processing at network nodes for fast spam detection at spam control points, e.g., at e-mail servers. Fast spam dete...
Restraining transmission of unsolicited bulk e-mail
Filtering large amounts of unsolicited bulk e-mail, also known as spam, is expensive, either because of the time spent on manual deletion or because complex analysis by algorithms puts a heavy load on e-mail servers. Due to exponential growth of the volume of spam campaigns, Internet Service Providers (ISPs) are increasingly forced to use rigorous rejection policies to prevent their filter servers fr...
Real-time statistical rules for spam detection
Spam detection methods fall into two categories: rule-based and statistical. The former detects spam by looking for spam-like patterns in an email. Since the rules can be shared, they have been popularized quickly. The rules, however, are built manually, so it is hard to keep them up to date with variations in spam. The statistical-based method, on the other hand, is possib...
Characterizing the Splogosphere
Weblogs, or blogs, collectively constitute the Blogosphere, forming an influential and interesting subset of the Web. As with most Internet-enabled applications, the ease of content creation and distribution makes the blogosphere spam-prone. Spam blogs, or splogs, are blogs hosting spam posts, created using machine-generated or hijacked content for the sole purpose of hosting ads or raising the Pag...
Survey on Text Classification (Spam) Using Machine Learning
E-mail spam is a very serious problem today. It has many consequences: it lowers productivity, occupies space in mailboxes, spreads viruses, Trojans, and materials containing potentially harmful information for a certain category of users, and destroys the stability of mail servers; as a result, users spend a lot of time sorting incoming mail and deleting undesirable corresponde...
Publication date: 2007